Implement conversion from LaTeX to our Markup XML #4787
This reimplements most of `bin/latex_to_unicode.py` within the new library. More tests are needed, and some conversions done in `latex_to_unicode` are still missing.
Codecov Report

Coverage diff, python-dev vs. #4787:

| Metric | python-dev | #4787 | +/- |
|---|---|---|---|
| Coverage | 93.49% | 93.67% | +0.17% |
| Files | 35 | 35 | |
| Lines | 2675 | 2782 | +107 |
| Hits | 2501 | 2606 | +105 |
| Misses | 174 | 176 | +2 |
@davidweichiang If you have a minute, I’d appreciate it if you could have a look at this. I have tried to port the logic of our [...] The test cases are here: https://github.com/acl-org/acl-anthology/pull/4787/files#diff-e559d67d054b0d61eb1f86a702d5373d2ea14dc6e1ff04aee432e7bcc6e912b3
This reimplements functionality from `bin/latex_to_unicode.py` within the new library, needed for #4766.

Work in progress. Works in principle, but needs many more test cases to ensure feature parity with the previous implementation. Also, some normalization steps (as done in the old `latex_to_unicode()` function) are not yet ported.

`bin/latex_to_unicode.py` implements some heuristics to determine whether characters such as % or ~ are LaTeX symbols or plain text; we should add that somehow, maybe as a parameter to the conversion functions?
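
To make the "parameter" idea concrete, here is a minimal sketch of what such a switch could look like. Everything in it is an assumption for illustration: the function name `resolve_specials`, the `strict_latex` flag, and the rules themselves are hypothetical and are not the heuristics actually implemented in `bin/latex_to_unicode.py` or the API of the new library.

```python
import re

NBSP = "\u00a0"  # Unicode no-break space, the usual rendering of a LaTeX tie


def resolve_specials(text: str, strict_latex: bool = False) -> str:
    """Hypothetical helper: decide how to read '%' and '~' in a string.

    strict_latex=True  -> treat the input as real LaTeX source
    strict_latex=False -> guess, assuming the input may be plain text
    """
    # An escaped percent sign means a literal '%' in both modes.
    # Park it behind a placeholder (assumes the input has no NUL characters).
    text = text.replace(r"\%", "\0")

    if strict_latex:
        # Pure LaTeX reading: '%' starts a comment that runs to end of line,
        # and every '~' is a non-breaking tie.
        text = re.sub(r"%.*", "", text)
        text = text.replace("~", NBSP)
    else:
        # Illustrative heuristic (an assumption, not the script's rule):
        # a '~' squeezed between two word characters is probably a tie,
        # as in "Figure~1"; any other '~' and every '%' stay literal.
        text = re.sub(r"(?<=\w)~(?=\w)", NBSP, text)

    # Restore the literal percent signs.
    return text.replace("\0", "%")
```

With the default setting, `resolve_specials("Figure~1 covers ~50% of cases")` keeps `~50%` as typed and only turns the tie in `Figure~1` into a no-break space. With `strict_latex=True` the same input loses everything after the `%`, which is the kind of surprise a caller-controlled parameter would let us avoid for metadata that is not guaranteed to be LaTeX.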